Abstract. This paper presents a new numerical abstract domain for
static analysis by abstract interpretation. This domain allows us to
represent invariants of the form $(x - y \le c)$ and $(\pm x \le c)$, where $x$ and $y$
are variable values and $c$ is an integer or real constant.
Abstract elements are represented by Difference-Bound Matrices, widely 
used by model-checkers, but we had to design new operators to meet the 
needs of abstract interpretation. The result is a complete lattice of infinite 
height featuring widening, narrowing and common transfer functions. 
We focus on giving an efficient $O(n^2)$ representation and graph-based
$O(n^3)$ algorithms, where $n$ is the number of variables, and claim that
this domain always performs more precisely than the well-known interval 
domain. 

To illustrate the precision/cost tradeoff of this domain, we have imple- 
mented simple abstract interpreters for toy imperative and parallel lan- 
guages which allowed us to prove some non-trivial algorithms correct. 



1 Introduction 

Abstract interpretation has proved to be a useful tool for eliminating bugs in soft- 
ware because it allows the design of automatic and sound analyzers for real-life 
programming languages. While abstract interpretation is a very general frame- 
work, we will be interested here only in discovering numerical invariants, that is 
to say, arithmetic relations that hold between numerical variables in a program. 
Such invariants are useful for tracking common errors such as division by zero 
and out-of-bound array access. 

In this paper we propose practical algorithms to discover invariants of the
form $(x - y \le c)$ and $(\pm x \le c)$, where $x$ and $y$ are numerical program variables
and $c$ is a numeric constant. Our method works for integers, reals and even
rationals.

For the sake of brevity, we will omit proofs of theorems in this paper. The 
complete proof for all theorems can be found in the author's MS thesis [12] . 



Previous and Related Work. Static analysis has developed approaches to
automatically find numerical invariants based on numerical abstract domains
representing the form of the invariants we want to find. Famous examples are
the lattice of intervals (described in, for instance, Cousot and Cousot's ISOP'76
paper [1]) and the lattice of polyhedra (described in Cousot and Halbwachs's
POPL'78 paper [8]) which represent respectively invariants of the form
$(v \in [c_1, c_2])$ and $(\alpha_1 v_1 + \cdots + \alpha_n v_n \le c)$. Whereas the interval analysis is very
efficient (linear memory and time cost) but not very precise, the polyhedron
analysis is much more precise but has a huge memory cost, exponential in the
number of variables.

Invariants of the form $(x - y \le c)$ and $(\pm x \le c)$ are widely used by the
model-checking community. A special representation, called Difference-Bound Matrices
(DBMs), was introduced, as well as many operators in order to model-check
timed automata (see Yovine's ES'98 paper [14] and Larsen, Larsson, Pettersson
and Yi's RTSS'97 paper [10]). Unfortunately, most operators are tied to
model-checking and are of little interest for static analysis.

Our Contribution. This paper presents a new abstract numerical domain 
based on the DBM representation, together with a full set of new operators and 
transfer functions adapted to static analysis. 

Sections 2 and 3 present a few well-known results about potential constraint 
sets and introduce briefly the Difference-Bound Matrices. Section 4 presents op- 
erators and transfer functions that are new — except for the intersection operator — 
and adapted to abstract interpretation. In Section 5, we use these operators to 
build lattices, which can be complete under certain conditions. Section 6 shows 
some practical results we obtained with an example implementation and Section 
7 gives some ideas for improvement. 

2 Difference-Bound Matrices 

Let $V = \{v_1, \ldots, v_n\}$ be a finite set of variables with values in a numerical set $I$
(which can be the set Z of integers, the set Q of rationals or the set R of reals).
We focus, in this paper, on the representation of constraints of the form
$(v_j - v_i \le c)$, $(v_i \le c)$ and $(v_i \ge c)$, where $v_i, v_j \in V$ and $c \in I$. By choosing one
variable to be always equal to 0, we can represent the above constraints using only
potential constraints, that is to say, constraints of the form $(v_j - v_i \le c)$. From
now on, we will choose $v_2, \ldots, v_n$ to be program variables, and $v_1$ to be the constant 0,
so that $(v_i \le c)$ and $(v_i \ge c)$ are rewritten $(v_i - v_1 \le c)$ and $(v_1 - v_i \le -c)$. We
assume we now work only with potential constraints over the set $\{v_1, \ldots, v_n\}$.

Difference-Bound Matrices. We extend $I$ to $\bar{I} = I \cup \{+\infty\}$ by adding the $+\infty$
element. The standard operations $\le$, $=$, $+$, $\min$ and $\max$ are extended to $\bar{I}$ as
usual (we will not use operations, such as $-$ or $*$, that may lead to indeterminate
forms).

Any set $C$ of potential constraints over $V$ can be represented uniquely by an
$n \times n$ matrix in $\bar{I}$, provided we assume, without loss of generality, that there do not
exist two potential constraints $(v_j - v_i \le c)$ in $C$ with the same left member and
different right members. The matrix $m$ associated with the potential constraint
set $C$ is called a Difference-Bound Matrix (DBM) and is defined as follows:

$$m_{ij} \triangleq \begin{cases} c & \text{if } (v_j - v_i \le c) \in C, \\ +\infty & \text{elsewhere.} \end{cases}$$
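The definition above can be illustrated with a minimal sketch. It assumes a list-of-lists matrix, `math.inf` for $+\infty$, and 0-based indices in which index 0 plays the role of the constant variable $v_1$; the function name `dbm_from_constraints` and the triple encoding `(j, i, c)` are choices made for this illustration only. If the same left member appears twice, the sketch keeps the tightest bound, which is one way to realize the "without loss of generality" assumption.

```python
import math

INF = math.inf

def dbm_from_constraints(n, constraints):
    """Build an n x n DBM from triples (j, i, c) meaning v_j - v_i <= c.

    Index 0 stands for the constant-zero variable (v_1 in the paper),
    so (j, 0, c) encodes v_j <= c and (0, i, c) encodes -v_i <= c.
    """
    m = [[INF] * n for _ in range(n)]
    for j, i, c in constraints:
        # Keep the tightest bound if a left member is repeated.
        m[i][j] = min(m[i][j], c)
    return m

# Variables: index 0 = zero, 1 = x, 2 = y.
# Constraints: x <= 4, x >= 1 (i.e. -x <= -1), y - x <= 3.
m = dbm_from_constraints(3, [(1, 0, 4), (0, 1, -1), (2, 1, 3)])
```

Here `m[0][1]` holds the upper bound of x, `m[1][0]` the negated lower bound of x, and `m[1][2]` the bound on y - x; every other entry stays at `INF`.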

Potential Graphs. A DBM $m$ can be seen as the adjacency matrix of a directed
graph $G = (V, A, w)$ with edges weighted in $I$. $V$ is the set of nodes, $A \subseteq V^2$ is
the set of edges and $w \in A \to I$ is the weight function. $G$ is defined by:

$$\begin{cases} (v_i, v_j) \notin A & \text{if } m_{ij} = +\infty, \\ (v_i, v_j) \in A \text{ and } w(v_i, v_j) = m_{ij} & \text{if } m_{ij} \ne +\infty. \end{cases}$$

We will denote by $\langle i_1, \ldots, i_k \rangle$ a finite sequence of nodes representing a path from
node $v_{i_1}$ to node $v_{i_k}$ in $G$. A cycle is a path such that $i_1 = i_k$.

V-Domain and V°-Domain. We call the V-domain of a DBM $m$, and we
denote by $\mathcal{D}(m)$, the set of points in $I^n$ that satisfy all potential constraints:

$$\mathcal{D}(m) \triangleq \{(x_1, \ldots, x_n) \in I^n \mid \forall i, j,\ x_j - x_i \le m_{ij}\}.$$

Now, remember that the variable $v_1$ has a special semantics: it is always
equal to 0. Thus, it is not the V-domain which is of interest, but the V°-domain
(which is a sort of intersection-projection of the V-domain) denoted by $\mathcal{D}^0(m)$
and defined by:

$$\mathcal{D}^0(m) \triangleq \{(x_2, \ldots, x_n) \in I^{n-1} \mid (0, x_2, \ldots, x_n) \in \mathcal{D}(m)\}.$$

We will call V-domain and V°-domain any subset of $I^n$ or $I^{n-1}$ which is
respectively the V-domain or the V°-domain of some DBM. Figure 1 shows an
example DBM together with its corresponding potential graph, constraint set,
V-domain and V°-domain.

$\trianglelefteq$ Order. The $\le$ order on $\bar{I}$ induces a point-wise order $\trianglelefteq$ on the set of DBMs:

$$m \trianglelefteq n \iff \forall i, j,\ m_{ij} \le n_{ij}.$$

This order is partial. It is also complete if $\bar{I}$ has least upper bounds, i.e., if $I$ is R
or Z, but not Q. We will denote by $=$ the associated equality relation which is
simply the matrix equality.

We have $m \trianglelefteq n \implies \mathcal{D}^0(m) \subseteq \mathcal{D}^0(n)$ but the converse is not true. In
particular, we do not have $\mathcal{D}^0(m) = \mathcal{D}^0(n) \implies m = n$ (see Figure 2 for a
counter-example).




Fig. 1. A constraint set (a), its corresponding DBM (b) and potential graph (c),
its V-domain (d) and V°-domain (e).

Fig. 2. Three different DBMs with the same V°-domain as in Figure 1. Remark
that (a) and (b) are not even comparable with respect to $\trianglelefteq$.



3 Closure, Emptiness, Inclusion and Equality Tests 

We saw in Figure 2 that two different DBMs can represent the same V°-domain.
In this section, we show that there exists a normal form for any DBM with a 
non-empty V°-domain and present an algorithm to find it. The existence and 
computability of a normal form is very important since it is, as often in abstract 
representations, the key to equality testing used in fixpoint computation. In the 
case of DBMs, it will also allow us to carry out an analysis of the precision of the
operators defined in the next section. 



Emptiness Testing. We have the following graph-oriented theorem: 
Theorem 1. 

A DBM has an empty V°-domain if and only if there exists, in its associated
potential graph, a cycle with a strictly negative total weight. □

Checking for cycles with a strictly negative weight is done using the well-known
Bellman-Ford algorithm which runs in $O(n^3)$. This algorithm can be found in
Cormen, Leiserson and Rivest's classical algorithmics textbook [2, §25.3].
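As an illustration of the emptiness test, here is a minimal Bellman-Ford sketch under the same assumptions as before (list-of-lists matrix, `math.inf` for $+\infty$, index 0 for the constant-zero variable). The name `is_empty` and the virtual-source trick (all distances start at 0) are choices of this sketch, not something prescribed by the text.

```python
import math

INF = math.inf

def is_empty(m):
    """Return True iff the potential graph of m has a strictly negative cycle."""
    n = len(m)
    # Equivalent to adding a virtual source linked to every node with weight 0.
    dist = [0] * n
    for _ in range(n):
        changed = False
        for i in range(n):
            for j in range(n):
                if m[i][j] != INF and dist[i] + m[i][j] < dist[j]:
                    dist[j] = dist[i] + m[i][j]
                    changed = True
        if not changed:
            return False  # distances stabilized: no negative cycle
    return True  # still relaxing after n passes: negative cycle

# x <= 4 together with x >= 5 (i.e. -x <= -5) is unsatisfiable.
contradictory = [[INF, 4], [-5, INF]]
# x <= 4 together with x >= 1 is satisfiable.
consistent = [[INF, 4], [-1, INF]]
```

Each pass scans all $n^2$ entries and at most $n$ passes are made, which matches the cubic bound quoted above for a dense matrix.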



Closure and Normal Form. Let $m$ be a DBM with a non-empty V°-domain
and $G$ its associated potential graph. Since $G$ has no cycle with a strictly negative
weight, we can compute its shortest-path closure $G^*$, the adjacency matrix of
which will be denoted by $m^*$ and defined by:

$$\begin{cases} m^*_{ii} \triangleq 0, \\ m^*_{ij} \triangleq \min\limits_{1 \le N,\ \langle i = i_1, i_2, \ldots, i_N = j \rangle} \sum_{k=1}^{N-1} m_{i_k i_{k+1}} & \text{if } i \ne j. \end{cases}$$

The idea of closure relies on the fact that, if $\langle i = i_1, i_2, \ldots, i_N = j \rangle$ is a path
from $v_i$ to $v_j$, then the constraint $v_j - v_i \le \sum_{k=1}^{N-1} m_{i_k i_{k+1}}$ can be derived from
$m$ by adding the potential constraints $v_{i_{k+1}} - v_{i_k} \le m_{i_k i_{k+1}}$, $1 \le k \le N-1$.
This is an implicit potential constraint which does not appear directly in the
DBM $m$. When computing the closure, we replace each potential constraint
$v_j - v_i \le m_{ij}$, $i \ne j$, in $m$ by the tightest implicit constraint we can find, and
each diagonal element by 0 (which is indeed the smallest value $v_i - v_i$ can reach).
In Figure 2, for instance, (c) is the closure of both the (a) and (b) DBMs.

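Theorem 2.

1. $m^* = \inf_{\trianglelefteq} \{n \mid \mathcal{D}^0(n) = \mathcal{D}^0(m)\}$.

2. $\mathcal{D}^0(m)$ saturates $m^*$, that is to say:

$\forall i, j$ such that $m^*_{ij} < +\infty$, $\exists (x_1 = 0, x_2, \ldots, x_n) \in \mathcal{D}(m)$, $x_j - x_i = m^*_{ij}$.

□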



Theorem 2.1 states that $m^*$ is the smallest DBM, with respect to $\trianglelefteq$, that
represents a given V°-domain, and thus the closed form is a normal form.
Theorem 2.2 is a crucial property to prove accuracy of some operators defined in the
next section.

Any shortest-path graph algorithm can be used to compute the closure of
a DBM. We suggest the straightforward Floyd-Warshall algorithm, which is described in
Cormen, Leiserson and Rivest's textbook [2, §26.2], and has an $O(n^3)$ time cost.
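A minimal Floyd-Warshall sketch of the closure follows, under the same representation assumptions (list-of-lists, `math.inf`, index 0 for the constant-zero variable). The name `close` and the convention of returning `None` for an empty V°-domain are choices of this sketch.

```python
import math

INF = math.inf

def close(m):
    """Shortest-path closure m* of a DBM, or None if a negative cycle exists."""
    n = len(m)
    c = [row[:] for row in m]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Tighten v_j - v_i using the path i -> k -> j.
                c[i][j] = min(c[i][j], c[i][k] + c[k][j])
    if any(c[i][i] < 0 for i in range(n)):
        return None  # strictly negative cycle: empty V°-domain
    for i in range(n):
        c[i][i] = 0  # v_i - v_i <= 0 is the tightest diagonal bound
    return c

# index 0 = zero, 1 = x, 2 = y; constraints x <= 4 and y - x <= 3.
m = [[INF, 4, INF],
     [INF, INF, 3],
     [INF, INF, INF]]
m_star = close(m)
```

In this example the closure makes the implicit constraint y <= 7 explicit in `m_star[0][2]`.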

Equality and Inclusion Testing. The case where $m$ or $n$ or both have an
empty V°-domain is easy; in all other cases we use the following theorem, which
is a consequence of Theorem 2.1:

Theorem 3.

1. If $m$ and $n$ have non-empty V°-domains, $\mathcal{D}^0(m) = \mathcal{D}^0(n) \iff m^* = n^*$.

2. If $m$ and $n$ have non-empty V°-domains, $\mathcal{D}^0(m) \subseteq \mathcal{D}^0(n) \iff m^* \trianglelefteq n$.

□

Besides emptiness test and closure, we may need, in order to test equality or
inclusion, to compare matrices with respect to the point-wise ordering $\trianglelefteq$. This
can be done with an $O(n^2)$ time cost.
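The two tests of Theorem 3 can be sketched as follows. The helper `close` is a compact Floyd-Warshall closure repeated here so that the block is self-contained; the names `leq`, `domain_subset` and `domain_equal`, and the handling of empty domains through `None`, are assumptions of this illustration.

```python
import math

INF = math.inf

def close(m):
    """Floyd-Warshall closure; None signals an empty V°-domain."""
    n = len(m)
    c = [row[:] for row in m]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                c[i][j] = min(c[i][j], c[i][k] + c[k][j])
    if any(c[i][i] < 0 for i in range(n)):
        return None
    for i in range(n):
        c[i][i] = 0
    return c

def leq(m, n):
    """Point-wise order on matrices (quadratic time)."""
    return all(a <= b for rm, rn in zip(m, n) for a, b in zip(rm, rn))

def domain_subset(m, n):
    """Inclusion of V°-domains: close only the left argument."""
    cm, cn = close(m), close(n)
    if cm is None:
        return True   # the empty set is included in everything
    if cn is None:
        return False
    return leq(cm, n)

def domain_equal(m, n):
    """Equality of V°-domains: compare normal forms."""
    cm, cn = close(m), close(n)
    if cm is None or cn is None:
        return cm is None and cn is None
    return cm == cn

# index 0 = zero, 1 = x, 2 = y
a = [[INF, 1, INF], [INF, INF, 0], [INF, INF, INF]]   # x <= 1, y - x <= 0
b = [[INF, 1, 1], [INF, INF, 0], [INF, INF, INF]]     # same plus explicit y <= 1
wide = [[INF, 2, INF], [INF, INF, INF], [INF, INF, INF]]  # x <= 2 only
```

The matrices `a` and `b` differ as matrices but describe the same V°-domain, which is exactly the situation the normal form is meant to resolve.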

Projection. We define the projection $\pi|_{v_k}(m)$ of a DBM $m$ with respect to a
variable $v_k$ to be the interval containing all possible values of $v \in I$ such that
there exists a point $(x_2, \ldots, x_n)$ in the V°-domain of $m$ with $x_k = v$:

$$\pi|_{v_k}(m) \triangleq \{x \in I \mid \exists (x_2, \ldots, x_n) \in \mathcal{D}^0(m) \text{ such that } x = x_k\}.$$

The following theorem, which is a consequence of the saturation property of the
closure, gives an algorithmic way to compute the projection:

Theorem 4.

If $m$ has a non-empty V°-domain, then $\pi|_{v_k}(m) = [-m^*_{k1}, m^*_{1k}]$
(interval bounds are included only if finite). □
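Theorem 4 translates into a short sketch: close the matrix, then read the two entries that relate $v_k$ to the constant-zero variable. As elsewhere, index 0 stands for the paper's $v_1$; `project` and the tuple return type are names and choices made for this illustration, and `close` is repeated to keep the block self-contained.

```python
import math

INF = math.inf

def close(m):
    """Floyd-Warshall closure; None signals an empty V°-domain."""
    n = len(m)
    c = [row[:] for row in m]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                c[i][j] = min(c[i][j], c[i][k] + c[k][j])
    if any(c[i][i] < 0 for i in range(n)):
        return None
    for i in range(n):
        c[i][i] = 0
    return c

def project(m, k):
    """Interval (low, high) of variable k, or None if the domain is empty."""
    c = close(m)
    if c is None:
        return None
    # c[k][0] bounds 0 - v_k, c[0][k] bounds v_k - 0.
    return (-c[k][0], c[0][k])

# index 0 = zero, 1 = x, 2 = y; constraints x <= 4, y - x <= 3, y >= 0.
m = [[INF, 4, INF],
     [INF, INF, 3],
     [0, INF, INF]]
```

On this example the bounds of x and y are not all written in `m`; they only appear after closure (x >= -3 follows from y >= 0 and y - x <= 3).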

4 Operators and Transfer Functions 

In this section, we define some operators and transfer functions to be used in 
abstract semantics. Except for the intersection operator, they are new. The
operators are basically point-wise extensions of the standard operators defined over
the domain of intervals [3].

Most algorithms presented here are either constant time, or point-wise, i.e., 
quadratic time. 



Intersection. Let us define the point-wise intersection DBM $m \wedge n$ by:

$$(m \wedge n)_{ij} \triangleq \min(m_{ij}, n_{ij}).$$

We have the following theorem:

Theorem 5.

$\mathcal{D}^0(m \wedge n) = \mathcal{D}^0(m) \cap \mathcal{D}^0(n)$. □

stating that the intersection is always exact. However, the resulting DBM is
seldom closed, even if the arguments are closed.



Least Upper Bound. The set of V°-domains is not stable by union, so we
introduce here a union operator which over-approximates its result. We define
the point-wise least upper bound DBM $m \vee n$ by:

$$(m \vee n)_{ij} \triangleq \max(m_{ij}, n_{ij}).$$

$m \vee n$ is indeed the least upper bound with respect to the $\trianglelefteq$ order. The
following theorem tells us about the effect of this operator on V°-domains:

Theorem 6.

1. $\mathcal{D}^0(m \vee n) \supseteq \mathcal{D}^0(m) \cup \mathcal{D}^0(n)$.

2. If $m$ and $n$ have non-empty V°-domains, then

$$(m^*) \vee (n^*) = \inf_{\trianglelefteq} \{o \mid \mathcal{D}^0(o) \supseteq \mathcal{D}^0(m) \cup \mathcal{D}^0(n)\}$$

and, as a consequence, $\mathcal{D}^0((m^*) \vee (n^*))$ is the smallest V°-domain (with
respect to the $\subseteq$ ordering) which contains $\mathcal{D}^0(m) \cup \mathcal{D}^0(n)$.

3. If $m$ and $n$ are closed, then so is $m \vee n$.

□

Theorem 6.1 states that $\mathcal{D}^0(m \vee n)$ is an upper bound in the set of V°-domains
with respect to the $\subseteq$ order. If precision is a concern, we need to find the least
upper bound in this set. Theorem 6.2, which is a consequence of the saturation
property of the closure, states that we have to close both arguments before
applying the $\vee$ operator to get this most precise union over-approximation. If
one argument has an empty V°-domain, the least upper bound we want is simply
the other argument. Emptiness tests and closure add an $O(n^3)$ time cost.
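The effect of closing the arguments before taking the point-wise maximum can be seen on a small sketch. The names `meet`, `join` and `close` are illustrative, the representation is the same list-of-lists with index 0 for the constant-zero variable, and the example matrices are made up for this purpose.

```python
import math

INF = math.inf

def close(m):
    """Floyd-Warshall closure; None signals an empty V°-domain."""
    n = len(m)
    c = [row[:] for row in m]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                c[i][j] = min(c[i][j], c[i][k] + c[k][j])
    if any(c[i][i] < 0 for i in range(n)):
        return None
    for i in range(n):
        c[i][i] = 0
    return c

def meet(m, n):
    """Point-wise minimum: exact intersection of V°-domains."""
    return [[min(a, b) for a, b in zip(rm, rn)] for rm, rn in zip(m, n)]

def join(m, n):
    """Point-wise maximum: over-approximates the union of V°-domains."""
    return [[max(a, b) for a, b in zip(rm, rn)] for rm, rn in zip(m, n)]

# index 0 = zero, 1 = x, 2 = y
m = [[INF, 1, INF], [INF, INF, 0], [INF, INF, INF]]      # x <= 1, y - x <= 0
n = [[INF, 2, 2], [INF, INF, INF], [INF, INF, INF]]      # x <= 2, y <= 2

coarse = join(m, n)                  # the bound y <= 1 hidden in m is not used
precise = join(close(m), close(n))   # closure exposes y <= 1 first
```

Without closure the join loses every bound on y, because `m` only implies y <= 1 implicitly; with closed arguments the join keeps y <= 2.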



Footnote 1: V°-domains are always convex, but the union of two V°-domains may not be convex.



Widening. When computing the semantics of a program, one often encounters
loops leading to fixpoint computation involving infinite iteration sequences. In
order to compute in finite time an upper approximation of a fixpoint, widening
operators were introduced in P. Cousot's thesis [3, §4.1.2.0.4]. Widening is a sort
of union for which every increasing chain is stationary after a finite number of
iterations. We define the point-wise widening operator $\nabla$ by:

$$(m \nabla n)_{ij} \triangleq \begin{cases} m_{ij} & \text{if } n_{ij} \le m_{ij}, \\ +\infty & \text{elsewhere.} \end{cases}$$
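The following properties prove that $\nabla$ is indeed a widening:

Theorem 7.

1. $\mathcal{D}^0(m \nabla n) \supseteq \mathcal{D}^0(m) \cup \mathcal{D}^0(n)$.

2. Finite chain property:
$\forall m$ and $\forall (n_i)_{i \in \mathbb{N}}$, the chain defined by:

$$\begin{cases} x_0 \triangleq m, \\ x_{i+1} \triangleq x_i \nabla n_i, \end{cases}$$

is increasing for $\trianglelefteq$ and ultimately stationary. The limit $l$ is such that $l \trianglerighteq m$
and $\forall i$, $l \trianglerighteq n_i$.

□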
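The point-wise widening is short enough to sketch directly. The function name `widen` and the loop-counter example are illustrative assumptions; the representation is the same as in the other sketches (index 0 is the constant-zero variable).

```python
import math

INF = math.inf

def widen(m, n):
    """Keep a bound of m only if n does not exceed it; otherwise drop it."""
    return [[a if b <= a else INF for a, b in zip(rm, rn)]
            for rm, rn in zip(m, n)]

# index 0 = zero, 1 = x. Abstract states of a counter x:
x0 = [[0, 0], [0, 0]]       # x = 0 on loop entry
n0 = [[0, 1], [0, 0]]       # 0 <= x <= 1 after one iteration
n1 = [[0, 2], [0, 0]]       # 0 <= x <= 2 after two iterations

x1 = widen(x0, n0)          # the unstable upper bound jumps to +infinity
x2 = widen(x1, n1)          # nothing left to lose: the chain is stationary
```

Each entry can only move once, from a finite value to $+\infty$, which is the intuition behind the finite chain property.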

The widening operator has some intriguing interactions with closure. Like the
least upper bound, the widening operator gives more precise results if its right
argument is closed, so it is rewarding to change $x_{i+1} = x_i \nabla n_i$ into
$x_{i+1} = x_i \nabla (n_i^*)$. This is not the case for the first argument: we can sometimes have
$\mathcal{D}^0(m \nabla n) \subsetneq \mathcal{D}^0((m^*) \nabla n)$. Worse, if we try to force the closure of the first
argument by changing $x_{i+1} = x_i \nabla n_i$ into $x_{i+1} = (x_i \nabla n_i)^*$, the finite chain
property (Theorem 7.2) is no longer satisfied, as illustrated in Figure 3.

Originally [1], Cousot and Cousot defined widening over intervals $\nabla^{\mathrm{I}}$ by:

$$[a, b] \nabla^{\mathrm{I}} [c, d] \triangleq [e, f],$$

where:

$$e \triangleq \begin{cases} a & \text{if } a \le c, \\ -\infty & \text{elsewhere,} \end{cases} \qquad f \triangleq \begin{cases} b & \text{if } b \ge d, \\ +\infty & \text{elsewhere.} \end{cases}$$

The following theorem proves that the sequence computed by our widening is 
always more precise than with the standard widening over intervals: 

Theorem 8. 

If we have the following iterating sequences:

$$\begin{cases} x_0 \triangleq m^*, \\ x_{k+1} \triangleq x_k \nabla (n_k^*), \end{cases} \qquad \begin{cases} [y_0, z_0] \triangleq \pi|_{v_i}(m), \\ [y_{k+1}, z_{k+1}] \triangleq [y_k, z_k] \nabla^{\mathrm{I}} \pi|_{v_i}(n_k), \end{cases}$$

then the sequence $(x_k)_{k \in \mathbb{N}}$ is more precise than the sequence $([y_k, z_k])_{k \in \mathbb{N}}$ in the
following sense:

$$\forall k,\ \pi|_{v_i}(x_k) \subseteq [y_k, z_k].$$

□

Fig. 3. Example of an infinite strictly increasing chain defined by $x_0 = m^*$,
$x_{i+1} = (x_i \nabla n_i)^*$. (The matrices of the original figure are not recoverable here.)

Remark that the technique, described in Cousot and Cousot's PLILP'92
paper [7], for improving the precision of the standard widening over intervals $\nabla^{\mathrm{I}}$ can
also be applied to our widening $\nabla$. It allows, for instance, deriving a widening
that always gives better results than a simple sign analysis (which is the case
of neither $\nabla^{\mathrm{I}}$ nor $\nabla$). The resulting widening over DBMs will remain more precise than
the resulting widening over intervals.

Narrowing. Narrowing operators were introduced in P. Cousot's thesis [3,
§4.1.2.0.11] in order to restore, in a finite time, some information that may
have been lost by widening applications. We define here a point-wise narrowing
operator $\Delta$ by:

$$(m \Delta n)_{ij} \triangleq \begin{cases} n_{ij} & \text{if } m_{ij} = +\infty, \\ m_{ij} & \text{elsewhere.} \end{cases}$$




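The following properties prove that $\Delta$ is indeed a narrowing:

Theorem 9.

1. If $\mathcal{D}^0(n) \subseteq \mathcal{D}^0(m)$, then $\mathcal{D}^0(n) \subseteq \mathcal{D}^0(m \Delta n) \subseteq \mathcal{D}^0(m)$.

2. Finite decreasing chain property:
$\forall m$ and for any chain $(n_i)_{i \in \mathbb{N}}$ decreasing for $\trianglelefteq$, the chain defined by:

$$\begin{cases} x_0 \triangleq m, \\ x_{i+1} \triangleq x_i \Delta n_i, \end{cases}$$

is decreasing and ultimately stationary.

□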

Given a sequence $(n_k)_{k \in \mathbb{N}}$ such that the chain $(\mathcal{D}^0(n_k))_{k \in \mathbb{N}}$ is decreasing
for the $\subseteq$ partial order (but not $(n_k)_{k \in \mathbb{N}}$ for the $\trianglelefteq$ partial order), one way
to ensure the best accuracy as well as the finiteness of the chain $(x_k)_{k \in \mathbb{N}}$ is
to force the closure of the right argument by changing $x_{i+1} = x_i \Delta n_i$ into
$x_{i+1} = x_i \Delta (n_i^*)$. Unlike widening, forcing all elements in the chain to be
closed with $x_{i+1} = (x_i \Delta n_i)^*$ poses no problem.
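A point-wise narrowing can be sketched in the same style as the widening. This sketch assumes the usual definition in which an entry of the left argument is refined only when it is $+\infty$ (it then takes the entry of the right argument) and is kept otherwise; the name `narrow` and the example values are illustrative.

```python
import math

INF = math.inf

def narrow(m, n):
    """Refine only the entries of m that are +infinity, using n."""
    return [[b if a == INF else a for a, b in zip(rm, rn)]
            for rm, rn in zip(m, n)]

# index 0 = zero, 1 = x.
widened = [[0, INF], [0, 0]]     # x >= 0, upper bound lost by widening
n0 = [[0, 10], [0, 0]]           # a later iterate knows 0 <= x <= 10
n1 = [[0, 5], [0, 0]]            # an even later iterate knows 0 <= x <= 5

x1 = narrow(widened, n0)         # recovers x <= 10
x2 = narrow(x1, n1)              # finite entries are frozen: still x <= 10
```

Each entry can be refined at most once, from $+\infty$ to a finite value, which is why decreasing chains built this way stabilize.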



Forget. Given a DBM $m$ and a variable $v_k$, the forget operator $m \backslash v_k$ computes a
DBM where all information about $v_k$ is lost. It is the opposite of the projection
operator $\pi|_{v_k}$. We define this operator by:

$$(m \backslash v_k)_{ij} \triangleq \begin{cases} \min(m_{ij}, m_{ik} + m_{kj}) & \text{if } i \ne k \text{ and } j \ne k, \\ 0 & \text{if } i = j = k, \\ +\infty & \text{elsewhere.} \end{cases}$$

The V°-domain of $m \backslash v_k$ is obtained by projecting $\mathcal{D}^0(m)$ on the subspace
orthogonal to the $v_k$ axis, and then extruding the result in the direction of $v_k$:

Theorem 10.

$$\mathcal{D}^0(m \backslash v_k) = \{(x_2, \ldots, x_n) \in I^{n-1} \mid \exists x \in I,\ (x_2, \ldots, x_{k-1}, x, x_{k+1}, \ldots, x_n) \in \mathcal{D}^0(m)\}.$$

□
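The forget operator can be sketched as a direct transcription of its three cases. The name `forget` and the example are illustrative; index 0 is again the constant-zero variable.

```python
import math

INF = math.inf

def forget(m, k):
    """Drop all constraints on variable k, keeping what they imply on the others."""
    n = len(m)
    r = [[INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != k and j != k:
                # Keep constraints that go through k as a one-step shortcut.
                r[i][j] = min(m[i][j], m[i][k] + m[k][j])
    r[k][k] = 0
    return r

# index 0 = zero, 1 = x, 2 = y; constraints x <= 1 and y - x <= 2.
m = [[INF, 1, INF],
     [INF, INF, 2],
     [INF, INF, INF]]
r = forget(m, 1)   # forget x
```

After forgetting x, the implied bound y <= 3 survives in `r[0][2]`, while every entry in row and column 1 (other than the diagonal) is $+\infty$.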



Guard. Given an arithmetic equality or inequality $g$ over $\{v_2, \ldots, v_n\}$, which
we call a guard, and a DBM $m$, the guard transfer function tries to find a new
DBM $m_{(g)}$ the V°-domain of which is $\{s \in \mathcal{D}^0(m) \mid s \text{ satisfies } g\}$. Since this is,
in general, impossible, we will only try to have:

Theorem 11.

$\mathcal{D}^0(m_{(g)}) \supseteq \{s \in \mathcal{D}^0(m) \mid s \text{ satisfies } g\}$. □

Here is an example definition:

Definition 12.

1. If $g = (v_{j_0} - v_{i_0} \le c)$ with $i_0 \ne j_0$, then:

$$(m_{(v_{j_0} - v_{i_0} \le c)})_{ij} \triangleq \begin{cases} \min(m_{ij}, c) & \text{if } i = i_0 \text{ and } j = j_0, \\ m_{ij} & \text{elsewhere.} \end{cases}$$

The cases $g = (v_{j_0} \le c)$ and $g = (-v_{i_0} \le c)$ are settled by choosing
respectively $i_0 = 1$ and $j_0 = 1$.

2. If $g = (v_{j_0} - v_{i_0} = c)$ with $i_0 \ne j_0$, then:

$$m_{(v_{j_0} - v_{i_0} = c)} \triangleq (m_{(v_{j_0} - v_{i_0} \le c)})_{(v_{i_0} - v_{j_0} \le -c)}.$$

The case $g = (v_{j_0} = c)$ is a special case where $i_0 = 1$.

3. In all other cases, we simply choose:

$$m_{(g)} \triangleq m.$$

□

In all but the last (general) case, the guard transfer function is exact.
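The first two cases of Definition 12 can be sketched as follows. The function names `guard_le` and `guard_eq` and the argument order `(m, j0, i0, c)` are assumptions of this illustration; index 0 is the constant-zero variable, so `guard_le(m, j0, 0, c)` encodes $v_{j_0} \le c$.

```python
import math

INF = math.inf

def guard_le(m, j0, i0, c):
    """Add the constraint v_j0 - v_i0 <= c to a copy of m."""
    r = [row[:] for row in m]
    r[i0][j0] = min(r[i0][j0], c)
    return r

def guard_eq(m, j0, i0, c):
    """Add v_j0 - v_i0 = c as two opposite inequalities."""
    return guard_le(guard_le(m, j0, i0, c), i0, j0, -c)

# index 0 = zero, 1 = x, 2 = y; start with no constraint at all.
top = [[INF] * 3 for _ in range(3)]

g1 = guard_le(top, 1, 0, 10)   # x <= 10
g2 = guard_eq(g1, 2, 1, 5)     # y - x = 5
```

The result is in general not closed: here `g2` does not yet contain the implied bound y <= 15, which a later closure would make explicit.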

Assignment. An assignment $v_k \leftarrow e(v_2, \ldots, v_n)$ is defined by a variable $v_k$ and
an arithmetic expression $e$ over $\{v_2, \ldots, v_n\}$.

Given a DBM $m$ representing all possible values that the variables
$\{v_2, \ldots, v_n\}$ can take at a program point, we look for a DBM, denoted by $m_{(v_k \leftarrow e)}$,
representing the possible values of the same variables after the assignment
$v_k \leftarrow e$. This is not possible in the general case, so the assignment transfer
function will only try to find an upper approximation of this set:

Theorem 13.

$$\mathcal{D}^0(m_{(v_k \leftarrow e)}) \supseteq \{(x_2, \ldots, x_{k-1}, e(x_2, \ldots, x_n), x_{k+1}, \ldots, x_n) \mid (x_2, \ldots, x_n) \in \mathcal{D}^0(m)\}.$$

□

For instance, we can use the following definition for $m_{(v_{i_0} \leftarrow e)}$:

Definition 14.

1. If $e = v_{i_0} + c$, then:

$$(m_{(v_{i_0} \leftarrow v_{i_0} + c)})_{ij} \triangleq \begin{cases} m_{ij} - c & \text{if } i = i_0,\ j \ne i_0, \\ m_{ij} + c & \text{if } i \ne i_0,\ j = i_0, \\ m_{ij} & \text{elsewhere.} \end{cases}$$

2. If $e = v_{j_0} + c$ with $i_0 \ne j_0$, then we use the forget operator and the guard
transfer function:

$$m_{(v_{i_0} \leftarrow v_{j_0} + c)} \triangleq ((m \backslash v_{i_0})_{(v_{i_0} - v_{j_0} \le c)})_{(v_{j_0} - v_{i_0} \le -c)}.$$

The case $e = c$ is a special case where we choose $j_0 = 1$.

3. In all other cases, we use a standard interval arithmetic to find an interval
$[-e^-, e^+]$, $e^+, e^- \in \bar{I}$, such that

$$[-e^-, e^+] \supseteq e(\pi|_{v_2}(m), \ldots, \pi|_{v_n}(m))$$

and then we define:

$$(m_{(v_{i_0} \leftarrow e)})_{ij} \triangleq \begin{cases} e^+ & \text{if } i = 1 \text{ and } j = i_0, \\ e^- & \text{if } j = 1 \text{ and } i = i_0, \\ (m \backslash v_{i_0})_{ij} & \text{elsewhere.} \end{cases}$$

□

In all but the last (general) case, the assignment transfer function is exact.
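Cases 1 and 2 of Definition 14 can be sketched in one function. The name `assign_var_plus_const`, its argument order and the inlined `forget` helper are assumptions of this illustration; index 0 is the constant-zero variable, so `j0 = 0` covers the assignment of a constant. The general case 3 (interval arithmetic) is not covered by this sketch.

```python
import math

INF = math.inf

def forget(m, k):
    """Forget operator, repeated here to keep the block self-contained."""
    n = len(m)
    r = [[INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != k and j != k:
                r[i][j] = min(m[i][j], m[i][k] + m[k][j])
    r[k][k] = 0
    return r

def assign_var_plus_const(m, i0, j0, c):
    """Abstract effect of v_i0 <- v_j0 + c."""
    n = len(m)
    if i0 == j0:
        # Case 1: shift every bound that mentions v_i0.
        r = [row[:] for row in m]
        for t in range(n):
            if t != i0:
                r[i0][t] = m[i0][t] - c   # v_t - v_i0 decreases by c
                r[t][i0] = m[t][i0] + c   # v_i0 - v_t increases by c
        return r
    # Case 2: forget v_i0, then impose v_i0 - v_j0 = c.
    r = forget(m, i0)
    r[j0][i0] = min(r[j0][i0], c)
    r[i0][j0] = min(r[i0][j0], -c)
    return r

# index 0 = zero, 1 = x, 2 = y; start with 0 <= x <= 1.
m = [[INF, 1, INF],
     [0, INF, INF],
     [INF, INF, INF]]
r1 = assign_var_plus_const(m, 1, 1, 2)    # x <- x + 2, so 2 <= x <= 3
r2 = assign_var_plus_const(r1, 2, 1, 1)   # y <- x + 1, so y - x = 1
```

Adding or subtracting `c` from an `INF` entry leaves it at `INF`, so unconstrained pairs stay unconstrained.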



Comparison with the Abstract Domain of Intervals. Most of the time, the 
precision of numerical abstract domains can only be compared experimentally 
on example programs (see Section 6 for such an example). However, we claim 
that the DBM domain always performs better than the domain of intervals. 

To legitimate this assertion, we compare informally the effect of all abstract 
operations in the DBM and in the interval domains. Thanks to Theorems 5 and
6.2, and Definitions 12 and 14, the intersection and union abstract operators
and the guard and assignment transfer functions are more precise than their
interval counterparts. Thanks to Theorem 8, approximate fixpoint computation
with our widening $\nabla$ is always more accurate than with the standard widening
over intervals $\nabla^{\mathrm{I}}$, and one could prove easily that each iteration with our narrowing
is more precise than with the standard narrowing over intervals. This means that
any abstract semantics based on the operators and transfer functions we defined 
is always more precise than the corresponding interval-based abstract semantics. 

5 Lattice Structures 

In this section, we design two lattice structures: one on the set of DBMs and one 
on the set of closed DBMs. The first one is useful to analyze fixpoint transfer 
between abstract and concrete semantics and the second one allows us to design 
a meaning function, or even a Galois connection, linking the set of abstract
V°-domains to the concrete lattice $\mathcal{P}(\{v_2, \ldots, v_n\} \to I)$, following the abstract
interpretation framework described in Cousot and Cousot's POPL'79 paper [5].

DBM Lattice. The set $\mathcal{M}$ of DBMs, together with the order relation $\trianglelefteq$ and the
point-wise least upper bound $\vee$ and greatest lower bound $\wedge$, is almost a lattice.
It only needs a least element $\bot$, so we extend $\trianglelefteq$, $\vee$ and $\wedge$ to $\mathcal{M}_\bot = \mathcal{M} \cup \{\bot\}$ in
an obvious way to get $\sqsubseteq$, $\sqcup$ and $\sqcap$. The greatest element $\top$ is the DBM with all
its coefficients equal to $+\infty$.

Theorem 15. 

1. $(\mathcal{M}_\bot, \sqsubseteq, \sqcap, \sqcup, \bot, \top)$ is a lattice.

2. This lattice is complete if $(\bar{I}, \le)$ is complete (Z or R, but not Q).

□ 

There are, however, two problems with this lattice. First, we cannot easily
assimilate this lattice to a sub-lattice of $\mathcal{P}(\{v_2, \ldots, v_n\} \to I)$ as two different
DBMs can have the same V°-domain. Second, the least upper bound operator $\sqcup$
is not the most precise upper approximation of the union of two V°-domains
because we do not force the arguments to be closed.



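Closed DBM Lattice. To overcome these difficulties, we build another lattice
based on closed DBMs. First, consider the set $\mathcal{M}^*_\bot$ of closed DBMs $\mathcal{M}^*$ with a
least element $\bot^*$ added. Now, we define a greatest element $\top^*$, a partial order
relation $\sqsubseteq^*$, a least upper bound $\sqcup^*$ and a greatest lower bound $\sqcap^*$ in $\mathcal{M}^*_\bot$ by:

$$\top^*_{ij} \triangleq \begin{cases} 0 & \text{if } i = j, \\ +\infty & \text{elsewhere.} \end{cases}$$

$$m \sqsubseteq^* n \iff \begin{cases} \text{either } m = \bot^*, \\ \text{or } m \ne \bot^*,\ n \ne \bot^* \text{ and } m \trianglelefteq n. \end{cases}$$

$$m \sqcup^* n \triangleq \begin{cases} m & \text{if } n = \bot^*, \\ n & \text{if } m = \bot^*, \\ m \vee n & \text{elsewhere.} \end{cases}$$

$$m \sqcap^* n \triangleq \begin{cases} \bot^* & \text{if } m = \bot^* \text{ or } n = \bot^* \text{ or } \mathcal{D}^0(m \wedge n) = \emptyset, \\ (m \wedge n)^* & \text{elsewhere.} \end{cases}$$

Thanks to Theorem 2.1, every non-empty V°-domain has a unique representation
in $\mathcal{M}^*$; $\bot^*$ is the representation for the empty set. We build a meaning
function $\gamma$ which is an extension of $\mathcal{D}^0(\cdot)$ to $\mathcal{M}^*_\bot$:

$$\gamma(m) \triangleq \begin{cases} \emptyset & \text{if } m = \bot^*, \\ \mathcal{D}^0(m) & \text{elsewhere.} \end{cases}$$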

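Theorem 16.

1. $(\mathcal{M}^*_\bot, \sqsubseteq^*, \sqcap^*, \sqcup^*, \bot^*, \top^*)$ is a lattice and $\gamma$ is one-to-one.

2. If $(\bar{I}, \le)$ is complete, this lattice is complete and $\gamma$ is meet-preserving:
$\gamma(\sqcap^* X) = \bigcap \{\gamma(x) \mid x \in X\}$. We can, according to Cousot and Cousot
[Prop. 7], build a canonical Galois insertion:

$$\mathcal{P}(\{v_2, \ldots, v_n\} \to I) \ \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} \ \mathcal{M}^*_\bot$$

where the abstraction function $\alpha$ is defined by:
$\alpha(D) = \sqcap^* \{m \in \mathcal{M}^*_\bot \mid D \subseteq \gamma(m)\}$.

□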

The $\mathcal{M}^*_\bot$ lattice features a nice meaning function and a precise union
approximation; thus, it is tempting to force all our operators and transfer functions to
live in $\mathcal{M}^*_\bot$ by forcing closure on their result. However, we saw this does not work
for widening, so fixpoint computation must be performed in the $\mathcal{M}_\bot$ lattice.



6 Results 

The algorithms on DBMs presented here have been implemented in OCaml and 
used to perform forward analysis on toy — yet Turing-equivalent — imperative and 
parallel languages with only numerical variables and no procedure. 



We present here neither the concrete and abstract semantics, nor the actual 
forward analysis algorithm used for our analyzers. They follow exactly the ab- 
stract interpretation scheme described in Cousot and Cousot's POPL'79 paper 
[5] and Bourdoncle's FMPA'93 paper [1] and are detailed in the author's MS
thesis [12]. Theorems 1, 3, 5, 6, 11 and 13 prove that all the operators and transfer
functions we defined are indeed abstractions on the domain of DBMs of the usual
operators and transfer functions on the concrete domain $\mathcal{P}(\{v_2, \ldots, v_n\} \to I)$,
which, as shown by Cousot and Cousot [5], is sufficient to prove soundness for 
analyses. 

Imperative Programs. Our toy forward analyzer for imperative language fol- 
lows almost exactly the analyzer described in Cousot and Halbwachs's POPL'78 
paper [8], except that the abstract domain of polyhedra has been replaced by 
our DBM-based domain. We tested our analyzer on the well-known Bubble Sort 
and Heap Sort algorithms and managed to prove automatically that they do 
not produce out-of-bound error while accessing array elements. Although we did
not find as many invariants as Cousot and Halbwachs for these two examples, it 
was sufficient to prove the correctness. We do not detail these common examples 
here for the sake of brevity. 

Parallel Programs. Our toy analyzer for parallel language allows analyzing a 
fixed set of processes running concurrently and communicating through global 
variables. We use the well-known nondeterministic interleaving method in order 
to analyze all possible control flows. In this context, we managed to prove au- 
tomatically that the Bakery algorithm, introduced in 1974 by Lamport [9], for 
synchronizing two parallel processes never lets the two processes be at the same 
time in their critical sections. We now detail this example. 

The Bakery Algorithm. After the initialization of two global shared variables
y1 and y2, two processes p1 and p2 are spawned. They synchronize through the
variables y1 and y2, representing the priority of p1 and p2, so that only one
process at a time can enter its critical section (Figure 4).

Our analyzer for parallel processes is fed with the initialization code (y1 = 0;
y2 = 0) and the control flow graphs for p1 and p2 (Figure 5). Each control graph
is a set of control point nodes and some edges labeled with either an action
performed when the edge is taken (the assignment y1 ← y2 + 1, for example) or
a guard imposing a condition for taking the edge (the test y1 ≠ 0, for example).

The analyzer then computes the nondeterministic interleaving of p1 and p2
which is the product control flow graph. Then, it computes iteratively the
abstract invariants holding at each product control point. It outputs the invariants
shown in Figure 6.

The state (2,c) is never reached, which means that p1 and p2 cannot be
at the same time in their critical section. This proves the correctness of the
Bakery algorithm. Remark that our analyzer also discovered some non-obvious
invariants, such as y1 = y2 + 1 holding in the (1,c) state.



y1 = 0; y2 = 0;

(p1)
    while true do
        y1 = y2 + 1;
        while y2 ≠ 0 and y1 > y2 do done;
        - - - critical section - - -
        y1 = 0;
    done

(p2)
    while true do
        y2 = y1 + 1;
        while y1 ≠ 0 and y2 ≥ y1 do done;
        - - - critical section - - -
        y2 = 0;
    done

Fig. 4. Pseudo-code for the Bakery algorithm.



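Fig. 5. Control flow graphs of processes p1 and p2 in the Bakery algorithm.
(Graph drawing lost in extraction. Edge labels for p1, over control points 0, 1, 2:
y1 ← y2 + 1; loop guard y2 ≠ 0 and y1 > y2; exit guard y2 = 0 or y1 ≤ y2;
critical section; y1 ← 0. Edge labels for p2, over control points a, b, c:
y2 ← y1 + 1; loop guard y1 ≠ 0 and y2 ≥ y1; exit guard y1 = 0 or y2 < y1;
critical section; y2 ← 0.)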
    State    Invariants
    (0,a), (0,b), (0,c)    not fully recoverable (surviving fragment: y2 ≥ 1)
    (1,a)    y1 ≥ 1, y2 = 0
    (1,b)    y1 ≥ 1, y2 ≥ 1
    (1,c)    y1 ≥ 2, y2 ≥ 1, y1 - y2 = 1
    (2,a)    y1 ≥ 1, y2 = 0
    (2,b)    y1 ≥ 1, y2 ≥ 1, y1 - y2 ∈ [-1, 0]
    (2,c)    ⊥

Fig. 6. Result of our analyzer on the nondeterministic interleaving product graph
of p1 and p2 in the Bakery algorithm.



7 Extensions and Future Work 

Precision improvement. In our analysis, we only find a coarse set of the
invariants held in a program since finding all invariants of the form $(x - y \le c)$
and $(\pm x \le c)$ for all programs is non-computable. Possible losses of precision
have three causes: non-exact union, widening in loops and non-exact assignment
and guard transfer functions.

We made crude approximations in the last (general) case of Definitions 12
and 14, and there is room for improving assignment and guard transfer functions,
even though exactness is impossible. When the DBM lattices are complete, there
exist most precise transfer functions such that Theorems 11 and 13 hold; however,
these functions may be difficult to compute.

Finite Union of V°-domains. One can imagine representing finite unions of
V°-domains, using a finite set of DBMs instead of a single one as abstract state.
This allows an exact union operator but it may lead to memory and time cost
explosion as abstract states contain more and more DBMs, so one may need
from time to time to replace a set of DBMs by their union approximation.

The model-checker community has also developed specific structures to
represent finite unions of V-domains that are less costly than sets. Clock-Difference
Diagrams (introduced in 1999 by Larsen, Weise, Yi and Pearson [11]) and
Difference Decision Diagrams (introduced in Møller, Lichtenberg, Andersen and
Hulgaard's CSL'99 paper [13]) are tree-based structures made compact thanks
to the sharing of isomorphic sub-trees; however, existence of normal forms for
such structures is only a conjecture at the time of writing and only local or
path reduction algorithms exist. One can imagine adapting such structures to
abstract interpretation the way we adapted DBMs in this paper.

Space and Time Cost Improvement. Space is often a big concern in abstract
interpretation. The DBM representation we proposed in this paper has a fixed
$O(n^2)$ memory cost, where $n$ is the number of variables in the program. In the
actual implementation, we decided to use the graph representation (or hollow
matrix), which stores only edges with a finite weight, and observed a great space
gain as most DBMs we use have many $+\infty$ entries. Most algorithms are also faster
on hollow matrices and we chose to use the more complex, but more efficient,
Johnson shortest-path closure algorithm, described in Cormen, Leiserson and
Rivest's textbook [2, §26.3], instead of the Floyd-Warshall algorithm.

Larsen, Larsson, Pettersson and Yi's RTSS'97 paper [10] presents a minimal
form algorithm which finds a DBM with the fewest finite edges representing a
given V°-domain. This minimal form could be useful for memory-efficient storing,
but cannot be used for direct computation with algorithms requiring closed
DBMs.



Representation Improvement. The invariants we manipulate are, in terms of
precision and complexity, between interval and polyhedron analysis. It is
interesting to look for domains allowing the representation of more forms of invariants
than DBMs in order to increase the granularity of numerical domains. We are
currently working on an improvement of DBMs that allows us to represent, with
a small time and space complexity overhead, invariants of the form $(\pm x \pm y \le c)$.

8 Conclusion 

We presented in this paper a new numerical abstract domain inspired by the
well-known domain of intervals and the Difference-Bound Matrices. This domain
allows us to manipulate invariants of the form $(x - y \le c)$, $(x \le c)$ and $(x \ge c)$
with an $O(n^2)$ worst-case memory cost per abstract state and an $O(n^3)$ worst-case
time cost per abstract operation (where $n$ is the number of variables in the
program).

Our approach made it possible for us to prove the correctness of some non- 
trivial algorithms beyond the scope of interval analysis, for a much smaller cost 
than polyhedron analysis. We also proved that this analysis always gives better 
results than interval analysis, for a slightly greater cost. 